A Geometrical Data-parallel Language

Authors

  • Jean-Luc Dekeyser
  • Dominique Lazure
  • Philippe Marquet
Abstract

The HELP project proposes a model of data-parallel programming that lets a programmer develop an algorithm as close as possible to the way he thinks about it. In many parts of a data-parallel program, the manipulations of data can be modelled as geometrical migrations inside a cartesian reference space. We define the language C-HELP in the frame of explicit data-parallel languages: communications and computations are separated, and any vector description is banished. This model and the associated languages are based on the hyper-space notion, and algorithm development follows an original semantics in which computations are limited to a set of hyper-space points. The hyper-space is not only a compilation-oriented concept: it is a multi-dimensional virtual array integrated into the programming model, and it provides a reference frame for every object access.

Introduction

Parallel programming techniques are divided into two fields: task parallelism and data parallelism. The two techniques coexist, and the community of parallelism specialists is divided into two clans. Nevertheless, a connection between the two philosophies can be observed: task parallelism is selected for the execution model (supported by the architecture), whereas data parallelism becomes the programming model (supported by the compilers).

Recent work has led to the emergence of two standard programming languages: HPF [For93] and HPC. With these tools it is possible to exploit the power of parallel architectures efficiently, but no general methodology is provided to help the scientist develop parallel algorithms. These new standards are based on standard languages (Fortran, C) to which some features are added independently of the programming model. In the C-based languages, the specific parallel features often come directly from the architecture, while the Fortran-based languages do not provide deterministic optimization criteria (a program can be optimized only after it has been executed).
Hence, the development of data-parallel algorithms needs tools that simplify access to the power of the architecture. The aim of the HELP project is to propose a general methodology integrating all the steps of parallel development. This methodology is particularly well suited to scientific algorithms. It relies on the following three principles:

Unification of the data-parallel development tools
  The design, the translation and the debugging of a data-parallel program have to be grounded on a single concept: the geometrical model. Data-parallel programming deals with computations applied in parallel to homogeneous data structures. These structures are often multi-dimensional arrays, and the descriptions of these objects can be translated into geometrical moves inside a cartesian space. The resolution methodology elaborated during the algorithm design has to be directly supported by the syntax and the semantics of the parallel programming language. The debugging tool has to rely on this methodology as well.

Explicit communications and parallel computations
  To be efficient, parallelism has to be expressed explicitly by the programmer. Like vector architectures, massively parallel machines require explicit communications and explicit calls to parallel computations. The compiler does not have to take the migrations of objects in the network into account. These migrations have too much influence on efficiency to be hidden from the programmer.

Separation of communications and computations
  Since communications and computations are explicit, the legibility of a parallel program requires that these two fundamental operations be separated at both the model level and the language level. The generated code also becomes more efficient as a result. HELP provides an explicit and imperative data-parallel language. Thus, the communication and the computation phases are clearly distinguished.
The HELP-based languages take this distinction into account by providing explicit communication operators. In contrast to High Performance Fortran [For93], the HELP model banishes every communication generated by a computation on non-aligned operands.

In this article, we detail every step of the development of a scientific algorithm using the HELP model. We present the main characteristics of the HELP implementation in the C-HELP language. The inversion of a square matrix by the Gauss-Jordan algorithm is developed as an example; it shows the suitability of the HELP model for matrix computation algorithms.

1 Data-parallel thinking

Before writing a parallel program, one has to choose a strategy to parallelize the problem. To support this choice, we propose the geometrical model.

1.1 The geometrical model

Scientific data-parallel algorithms mainly rely on the manipulation of regular structures (vectors, matrices...) to which computations are applied, either globally or limited to a regular subset (for example, a matrix row). The description of the algorithm is simplified if the programming model is restricted to the manipulation of geometrical objects and their geometrical subsets.

For example, in many numerical algorithms, a matrix has to interact with one or several of its rows or columns. In this case, and in order to avoid implicit communications, the data-parallel model has to adopt a replication primitive that extends the vector and makes it conform to the matrix.

The programming model able to implement these algorithms consists in defining a work space, moving data-parallel objects inside this space with geometrical primitives, and then triggering computations locally at the points of this space. These computations may be triggered in parallel because each point owns its elemental operands. A geometrical approach allows the programmer to visualize his algorithm inside the space.
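As an illustration of the replication idea, the following Python sketch (it is not C-HELP and is not taken from the paper) replicates a row so that it becomes conform to a matrix, and then computes locally at every point. The names `expand_row` and `pointwise` are hypothetical helpers chosen for this example, and nested lists stand in for data-parallel objects.

```python
def expand_row(row, n_rows):
    """Replicate a vector n_rows times so that it has the shape of a matrix."""
    return [list(row) for _ in range(n_rows)]


def pointwise(op, a, b):
    """Apply op at every point; both operands must have the same shape."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("operands are not conform")
    return [[op(x, y) for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


matrix = [[1, 2, 3], [4, 5, 6]]

# Communication phase: the first row is made conform to the whole matrix.
replicated = expand_row(matrix[0], len(matrix))

# Computation phase: every point combines the two operands it now owns.
result = pointwise(lambda x, y: x - y, matrix, replicated)
```

The two phases are kept apart on purpose: once the replication is done, the subtraction needs no data from any other point.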
Understanding the data-parallel model is made easier by associating the geometrical visualization with the geometrical expression of the algorithm.

1.2 Example: Gauss-Jordan, the algorithm

The Gauss-Jordan algorithm for the inversion of a square matrix is made up of three phases: the initialization, the diagonalization and the inverse computation.

Initialization

The square matrix A has to be inverted. A and an identity matrix of the same size are coupled together. The treatments applied to A are simultaneously computed on this second matrix. The "union" of the two matrices is called M.

[Figure: the matrix M, made of A on the left and the identity matrix on the right.]

Diagonalization

In order to zero the elements of the pivot column p, we apply to each row r, with $r \neq p$:

$\mathrm{row}_r = \mathrm{row}_r - \mathrm{row}_p \cdot A_{rp} / d_p$

Since the communications are explicit, we have to diffuse the pivot row ($\mathrm{row}_p$) onto every other row, and to diffuse the pivot column ($\mathrm{col}_p$) onto every other column, in order to make $A_{rp}$ visible along row r. This treatment may be carried out in parallel.

[Figure: the matrix M during the diagonalization, showing the diagonal elements $d_1, d_2, \dots, d_p$, the pivot row $\mathrm{row}_p$, the pivot column $\mathrm{col}_p$ and a row $\mathrm{row}_r$.]

Inverse computation

After the diagonalization phase, each matrix row has to be divided by the value of the diagonal element of the same index:

$\mathrm{row}_r = \mathrm{row}_r / d_r$

In order to diffuse each element of the diagonal to the right part of the matrix M, we use the geometrical primitives of projection and replication. This communication treatment is carried out in parallel for each row.

[Figure: the diagonalized matrix M with diagonal elements $d_1, d_2, \dots, d_r, \dots, d_N$; after the division, its right part holds $A^{-1}$.]

This basic example demonstrates the suitability of a data-parallel model using a 2-dimensional space and geometrical communication primitives for general matrix computation: geometrical primitives provide the data migrations, after which the data are available at the points where computations have to be triggered.

1.3 Hyper-Space: A geometrical reference for the HELP model

Interacting objects are always manipulated inside a hyper-space. A hyper-space is a geometrical set of points where the parallel objects (matrices, vectors) are positioned and moved.
The point of a HELP hyper-space is the basic entity for computation. All computations on data-parallel objects are triggered at the point level, on the local data. Several hyper-spaces may be defined for a single algorithm, for different specifications of size or data shape. A hyper-space groups and aligns together the objects that share the same distribution. The programmer specifies the data distribution of the program through directives attached to the hyper-space. Objects of the same hyper-space are therefore allocated with the same distribution algorithm.

A hyper-space is defined as a cartesian reference frame of points with positive coordinates. Some information is attached to each hyper-space:

  • The size of each dimension (number of points).
  • Priority orders between the hyper-space dimensions, allowing the programmer to privilege either parallelism or communications. Two concepts are provided for this feature:
      • The elementary block specifies a geometrical set of hyper-space points. Communications between two points of the same block never generate any physical communication. In particular, a full dimension may be grouped inside a single elementary block of the same size. During the projection, the HELP compiler will then allocate this dimension in memory.
      • The parallelism priority tree: a priority order between the hyper-space dimensions allows the programmer to privilege the physical projection of a dimension onto the processor network. Several examples of this parallelism priority tree are developed in part 2.
  • Secondary dimensions can be defined as the composition of some hyper-space dimensions. With this feature, one can manipulate non-regular objects such as the diagonal of a matrix. These secondary dimensions are also used with the geometrical communication primitives.

The data-parallel objects (DPO) are necessarily linked to a hyper-space.
Their shape and their position are dynamic during execution, but the association of a DPO with a hyper-space is fixed. These objects are multi-dimensional arrays, defined in relation to the hyper-space dimensions, either primary or secondary. Each point included in the object geometry holds one of its elements.

1.4 Two programming levels

The HELP thinking model clearly separates communications from computations. Following this model, HELP languages provide two programming levels: the microscopic level, where the computations local to a point are triggered; and the macroscopic level, where the communications are modelled by calls to geometrical primitives.

2 Data-parallel writing

After designing the algorithm at the model level, one has to translate it into a data-parallel language. Such a language is derived from a classical language (C or Fortran) in which some features derived from the model are included. One can also program the algorithm directly with HelpDraw, a graphical editor derived from the HELP model that provides all the geometrical features of the hyper-space notion.

To ease the learning of the data-parallel language, sequential code is still expressed in the usual language. Only the data-parallel part of the algorithm is written with specific constructors. These constructors are identical for C-HELP and Fortran-HELP.

2.1 The C-HELP language

Hyper-space declaration

For each hyper-space dimension, the programmer has to declare:

  • A name, to increase the legibility of the C-HELP code.
  • The size (number of points).
  • The elementary block size. By default, this size is 1. The star symbol ('*') causes the dimension to be mapped in memory.

The parallelism priority is specified by a priority tree. The dimensions with the highest priority (those appearing at the highest level of the tree) are mapped onto the physical network, in order to privilege parallelism.
If two dimensions have the same priority (the same level in the tree), the priority is assigned from left to right. By default, the priority tree has one level and the dimensions appear in declaration order.

Examples of priority trees

A 6 × 2 hyper-space is mapped onto a 2-dimensional grid machine with 3 × 2 processors. Two layers are necessary to map two points per physical processor. Three different mappings can be obtained with different priority trees:

  hspace plan [ x = 6 , y = 2 ] (x,(y));
  hspace plan [ x = 6 , y = 2 ] ((x),y);
  hspace plan [ x = 6 , y = 2 ] (x,y);

[Figure: for each of the three declarations, the placement of the twelve hyper-space points on the 3 × 2 processor grid, in two layers.]

Secondary dimensions

The secondary dimensions are obtained by composing several hyper-space dimensions. The composition can involve primary or secondary dimensions, provided that no primary dimension appears more than once in the expression. The size of a secondary dimension cannot be specified by the programmer.

  hspace plan [ x = 100 , y = 100 , d = (x , y) ];

[Figure: the plane (x, y) with the secondary dimension d.]

The hyper-space declaration follows the grammar:

[Grammar: the production rules are garbled in this copy and their non-terminal names are missing; the terminals that remain are hspace, `[', `]', `,', `=', `(', `)' and `*'.]

Data-parallel object declaration

By default, every DPO is dynamic in its size and its position inside the hyper-space. A DPO can be declared static with the keyword steady if its size and position are known at compile time. Such a DPO cannot migrate over the hyper-space. A DPO is built over primary and secondary dimensions, but the primary dimensions used to define a secondary dimension cannot all be used together with this secondary dimension to allocate a DPO.
The DPO declaration follows the grammar:

[Grammar: the production rules are garbled in this copy and their non-terminal names are missing; the terminals that remain are the optional keyword steady, dpo, `[', `]', `,', `=', `*', `:', `;' and `conform' `(' `)'.]

In a DPO declaration, a dimension of the hyper-space is referenced either by its name or by its rank in the declaration order of the primary dimensions. To specify the position and the size, one can give:

  • Only the position; the DPO dimension then has size one by default.
  • `*'; the DPO dimension is complete (the same size as the hyper-space dimension).
  • A pair lower bound `:' upper bound.
  • A pair lower bound `;' length.

  hspace cubicspace [ x=100 , y=100 , z=100, d=(x,y) ] ;
  dpo float cubicspace [ x=40;30 , y=80;20, z=1 ] cube ;
  dpo float cubicspace [ x=30:100, y=10 , z=* ] plan ;
  dpo float cubicspace [ x=1 , z=80 , y=* ] line ;
  dpo float cubicspace [ x=1 , y=50 , z=50] point ;
  dpo float cubicspace [ x=30 , d=1;70 , z=1 ] diag ;
  dpo float cubicspace conform(line) sameline;

[Figure: the hyper-space cubicspace with its axes x, y and z, containing the DPOs line, diag, point, plan and cube.]

The microscopic level

The microscopic treatments are computed locally at the hyper-space points. Such treatments consist in applying scalar operators or functions to the DPO values that belong to the same point. All the scalar operators of the host language are extended to DPO manipulation. To make two DPOs interact, the HELP model requires the programmer to ensure their conformity (same position and same shape). Since the projection directives are attached to the hyper-space, the logical conformity of two DPOs implies the physical conformity of the data. There is no implicit communication in C-HELP, which makes code generation simpler and more efficient.

The conformity domain

By default, two interacting DPOs have to be conform; the conformity domain is defined as the set of points where these DPOs are allocated.
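The four ways of giving a position and a size, and the default conformity test (same position and same shape), can be modelled in a few lines of Python. This is only an illustrative sketch, not part of C-HELP: `parse_extent`, `declare` and `conform` are hypothetical names, coordinates are assumed to start at 1, bounds are assumed to be inclusive, and secondary dimensions are left out.

```python
def parse_extent(spec, size):
    """Return the inclusive (lower, upper) bounds of one DPO dimension.

    Accepted forms: '5' (position only), '*' (whole dimension),
    '30:100' or '31:*' (lower:upper) and '40;30' (lower;length).
    """
    spec = spec.strip()
    if spec == "*":
        lower, upper = 1, size
    elif ":" in spec:
        lo, up = spec.split(":")
        lower = int(lo)
        upper = size if up.strip() == "*" else int(up)
    elif ";" in spec:
        lo, length = spec.split(";")
        lower = int(lo)
        upper = lower + int(length) - 1
    else:
        lower = upper = int(spec)
    if not 1 <= lower <= upper <= size:
        raise ValueError(f"extent {spec!r} does not fit in a dimension of size {size}")
    return (lower, upper)


def declare(hspace, specs):
    """Build the geometry of a DPO: one extent per primary dimension."""
    return {dim: parse_extent(specs[dim], size) for dim, size in hspace.items()}


def conform(a, b):
    """Default conformity rule: same position and same shape."""
    return a == b


cubicspace = {"x": 100, "y": 100, "z": 100}
cube = declare(cubicspace, {"x": "40;30", "y": "80;20", "z": "1"})
plan = declare(cubicspace, {"x": "30:100", "y": "10", "z": "*"})
line = declare(cubicspace, {"x": "1", "z": "80", "y": "*"})
sameline = dict(line)  # plays the role of conform(line)
```

With these definitions, `cube` spans x from 40 to 69 and y from 80 to 99, so it is not conform to `plan`, whereas `sameline` is conform to `line` by construction.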
[Figure: two cases of A = B + C in the (x, y) plane. Legal: A, B and C occupy the same points, and the conformity domain is A = B = C. Illegal: A, B and C do not occupy the same points.]

Constraint domain, conformity rule

Sometimes the computation has to be restricted to a set of points common to DPOs that do not have the same shape or the same position. The constructor on(DPO) limits the conformity domain to the points where DPO is allocated. The objects interacting within the scope of an on(DPO) must include DPO. The constraint domain can be reduced successively by nesting on constructors.

[Figure: A + B is illegal when A and B are not conform; on(B) A + B is legal when A includes B.]

The conformity rule must be applied to all the operands of all the microscopic operators (except the injective assignment, see below). Under this rule, two DPOs are conform in the following cases:

  1. If the conformity domain is expressed by a conformity constructor on, the domain of each DPO has to include this domain.
  2. If there is no explicit conformity domain, all the DPOs must have the same shape and the same position inside the hyper-space.

Scalar data are considered conform to any DPO. Moreover, a DPO appearing as the argument of a nested on has to be included in the previous conformity domain (successive reductions of the conformity domain).

[Figure: the DPOs A, B, C and D in the (x, y) plane.]

  on (B) {      /* conformity domain = allocation domain of B */
    A + B;      /* correct: A and B include the conformity domain */
    on (C) {    /* correct: the conformity domain includes C */
                /* the conformity domain becomes the domain of C */
      D + A;    /* correct: D and A include the conformity domain */
    }           /* the conformity domain is back to B */
    D + B;      /* error: D does not include B */
  }

Masked domain

Like most data-parallel languages, C-HELP makes it possible to mask some points of the conformity domain with the operator where(DPO_expr). The DPOs appearing in the expression DPO_expr and in the expressions of the where block have to be conform.
The new masked domain is composed of the points where the evaluation of DPO_expr yields true. The masked domain is a subset of the conformity domain. The microscopic operations nested in a masked domain are computed only at the points of the masked domain. Moreover, the application of a macroscopic operator or a function call masks the current conformity domain (see below). The masked domain is then inverted for the execution of the elsewhere block.

  /* A, B, C, and D are conform */
  where (A!=0)                    /* mask1 : (A!=0) */
    where (C!=0)                  /* mask2 : (mask1) && (C!=0) */
      A = ( (B / C) + D ) / A ;
    elsewhere                     /* mask3 : not(mask2) */
      A = D / A;
  elsewhere                       /* mask4 : not(mask1) */
    A = 1 ;

The C-language operator expr ? then_expr : else_expr is extended to DPO operands. This constructor is valid if expr, then_expr and else_expr satisfy the conformity rule. The resulting DPO is allocated on the conformity domain.

Association

The association operator

[Grammar: two forms remain recognizable in this copy, (1) a dot followed by a list of macroscopic primitives, and (2) a dot followed by a parenthesized expression, a list of macroscopic primitives, a colon and an expression; the non-terminal names are missing.]

The application of a list of macroscopic primitives produces a temporary DPO whose size and position depend on the source DPO and on the features of the primitives. Note that applying the dot masks the enclosing conformity domain: the expression is evaluated ignoring the enclosing conformity domain, and the conformity rule is applied locally to this expression.

Optionally, a Macroscopic Control Domain (MCD) identifies a subset of the points of the source DPO to which the primitive list is applied; the source expression and the MCD expression produce two conform results. The resulting DPO is conform to the result of applying the macroscopic primitives to the source DPO, regardless of the MCD. The value at a point outside the target of the MCD is obtained by evaluating the expression that follows the colon; this expression must therefore be conform to the resulting DPO.

  A = (C!=0 ?(B/C):0) . (C!=0) trans(x,100) : 0 ;

Only the values of B/C at points where C is non-zero are transferred.

[Figure: the DPOs B, C and A in the (x, y) plane for the example above.]

[Figure: effect of the macroscopic primitives shifttor(x,2), shifttor(x,-2) and flip(x) on a DPO in the (x, y) plane.]

Example: Gauss-Jordan, the diagonalization

A diagonalization step of the matrix M is expressed as:

  M -= M.extract (y,i) .expand (y) * M.extract (x,i) .expand (x) / M.scalar (i,i) ;

Hence, the algorithm is translated directly from geometrical numerical thinking to the geometrical HELP model:

  • The pivot row is communicated to every other row.
  • The pivot column is communicated to every other column.
  • The pivot value is converted to a scalar (so that it is visible from the whole hyper-space).
  • Then, the new value of each element of the matrix M is computed at each point in parallel. This computation is a linear combination of rows that zeroes the pivot column.

To avoid zeroing the pivot itself, this computation is triggered inside a mask block.

The HELP functions

The function notion of the host language is extended to DPO manipulation. A new type of function is defined: the microscopic functions.

Microscopic functions

The microscopic functions extend the features of the microscopic operators. Like these operators, they are computed locally at the active points of the conformity domain and do not generate any communication. Microscopic functions are declared with the micro keyword and written with the host language syntax; they may call only microscopic functions. When such a function is called, the actual DPO parameters have to satisfy the conformity rule. The microscopic function is applied at each point of the conformity domain or, where applicable, of the masked domain.

  micro int gcd (a,b)
  int a, b ;
  {
    if (a > b) return gcd (a-b, b) ;
    if (a < b) return gcd (a, b-a) ;
    return a ;
  }

  dpo int planar [*,*] A, B ;
  A = gcd (A, B) ;
  where (B>30) A = gcd (A, 10) ;

The arithmetical functions of the standard libraries are implicitly extended to microscopic functions.

General functions

The hyper-space is a reference frame for each access to a DPO.
Hence, for a DPO parameter of a function, this reference frame is needed to access its geometry and the values of its elements; the hyper-space of the DPO parameters therefore has to be passed as an argument. To provide a powerful model of function (or library) calls, the notion of sub-hyper-space is defined. A hyper-space parameter is considered as a sub-hyper-space of the calling hyper-space. For a function call, the programmer has to express explicitly the position of the formal sub-hyper-space inside the calling hyper-space. The DPO parameters, the local DPOs and the result may be positioned inside this formal sub-hyper-space.

  dpo float planar MatMul(planar,M1,M2)
  hspace planar [x,y];
  dpo float planar [*,*] M1,M2;
  {
    int iter;
    dpo float planar[*,*] res;

    res=0;
    M1 = M1 . shifttor(x,-1*(ipoint(M1,y)-1));
    M2 = M2 . shifttor(y,-1*(ipoint(M1,x)-1));
    res = M1 * M2;
    for (iter=1;iter<SizeDim(M1,x);iter++) {
      M1 = M1 . shifttor(x,-1);
      M2 = M2 . shifttor(y,-1);
      res += M1*M2 ;
    }
    return res;
  }

  hspace cubic [a=100,b=100,c=100];
  dpo float cubic [a=31;70 ,b=75 , c=1;70] Mat1,Mat2,Mat3;
  Mat3 <MatMul(cubic(a(31:*),b(75),c),Mat1,Mat2);

[Figure: the formal sub-hyper-space (x, y) positioned inside the calling hyper-space cubic (a, b, c).]

A call to a general function is independent of the current context; in particular, the current activity is not taken into account for this call. The evaluation of the actual parameters masks the enclosing conformity domain. If the hyper-space of the DPO parameters is in the scope of the function, it does not need to be passed as an argument (see the Gauss-Jordan example).

Example: Gauss-Jordan

  /**************************************************************************/
  /* Gauss-Jordan algorithm. Square matrix (N * N) inversion.               */
  #define N 100
  hspace planar [ x = 2*N , y = N , d = (x,y) ] ;
  #define row(i) extract(y,i)
  #define col(i) extract(x,i)
  #define diag(i) scalar(i,i)

  dpo float planar GaussJordan (A)
  dpo float planar[ x=1:N, y=* ] A ;
  {
    int i ;
    steady dpo float planar [ x=*, y=* ] M ;
    dpo float planar [ x=1+N;N, y=* ] res ;
    dpo float planar [ x=1, d=1:N ] D ;

    /*************************** Initialization ******************************/
    /* The assignment is injective, the conformity has not to be verified. */
    M = A ;
    /* ipoint(DPO,x) returns each point coordinate onto x dimension. */
    M = ipoint(res,x)==ipoint(res,y) ? 1 : 0 ;

    /*************************** Diagonalization *****************************/
    /* One step per pivot, for the pivots 1 to N. */
    for (i=1;i<=N;i++)
      where (ipoint(M,y) != i)
        M -= M.row(i).expand(y) * M.col(i).expand(x) / M.diag(i) ;

    /*************************** Inverse computation *************************/
    on (D) D = M ;
    on (res) res = M / D.exchange(d,y).trans(x,N).expand(x,N) ;
    return res ;
  }

2.2 HelpDraw: a programming environment

To program data-parallel algorithms interactively, we have defined HelpDraw, a graphical environment that allows the visualization of the HELP macroscopic primitives [BDM93].

The first feature of HelpDraw is a dual HELP code editor. The first part of this editor is a standard text editor, whereas the second one is dedicated to the graphical editing of the macroscopic parts of the algorithm. HelpDraw provides a direct-manipulation interface that allows the programmer to handle the DPOs graphically, with the mouse, menus or dialog boxes. The HELP code resulting from these manipulations is produced automatically and linked with the text editor.

HelpDraw also has a demonstrational aspect, which eases repetitive DPO manipulations for the programmer: for example, directly reaching a known goal, such as making two DPOs conform, or applying the same macroscopic DPO manipulation several times.
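As a sequential cross-check of the GaussJordan function of section 2.1, the three phases can be mirrored in plain Python. This is a sketch under stated assumptions rather than a translation of the C-HELP semantics: `gauss_jordan_inverse` is a hypothetical name, nested lists replace DPOs, the loops replace the parallel point-level computation, and, like the C-HELP program, the sketch performs no row exchange, so it fails on a zero pivot.

```python
def gauss_jordan_inverse(a):
    """Invert a square matrix given as a list of rows, without pivoting."""
    n = len(a)

    # Initialization: M is A on the left and the identity on the right.
    m = [[float(v) for v in row] + [1.0 if r == c else 0.0 for c in range(n)]
         for r, row in enumerate(a)]

    # Diagonalization: one step per pivot p.
    for p in range(n):
        pivot = m[p][p]                          # counterpart of M.diag(p)
        if pivot == 0.0:
            raise ZeroDivisionError("zero pivot: this sketch does no row exchange")
        pivot_row = list(m[p])                   # counterpart of M.row(p).expand(y)
        pivot_col = [m[r][p] for r in range(n)]  # counterpart of M.col(p).expand(x)
        for r in range(n):
            if r != p:                           # counterpart of the where mask
                m[r] = [m[r][c] - pivot_row[c] * pivot_col[r] / pivot
                        for c in range(2 * n)]

    # Inverse computation: divide the right part of each row by its diagonal element.
    return [[m[r][n + c] / m[r][r] for c in range(n)] for r in range(n)]


inverse = gauss_jordan_inverse([[2, 1], [1, 1]])
```

Copying the pivot row and the pivot column before the update plays the role of the explicit communications: every point reads values that were diffused beforehand, never values modified during the same step.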
3 Related works

Several criteria are taken into account to distinguish the data-parallel languages:

Basic language
  Data-parallel languages extend a classical language.

Data-parallel object keyword
  To specify that an object is data-parallel, a data-parallel language based on C often provides a keyword, whereas the Fortran extensions implicitly define arrays as parallel constructions.

Abstract machine
  An abstract machine provides a support for the definition of parallel objects and for the realization of operations such as communications.

Object virtuality
  A language provides object virtuality if a data-parallel object may be declared independently of the size of the target machine.

Access to machine
  Some data-parallel languages provide both virtuality and access to the characteristics of the physical machine.

Object dynamicity
  An object is dynamic if its size and its shape may change during execution.

Heterogeneous alignment
  We call heterogeneous alignment the alignment of two objects of different shapes.

Explicit communications
  To generate communications, a data-parallel language can either express them explicitly, or make two objects interact through different descriptions of these objects; in the latter case, the communications are implicit.

Explicit distribution specification
  Some data-parallel languages make it possible to specify the mapping of the abstract machine onto the physical processors.

Figure 1 describes the main features of some well-known data-parallel languages:

  • Fortran extensions: Fortran-90 [MR90, ANS91], CM-Fortran [Thi90], MP-Fortran [Mas91a], Fortran-D [FHK+91], HPF [For93].
  • C extensions: C* [RS87, Fra91], MPL [Mas91b], Hyper-C (POMP-C) [Par92, Hyp93].
  • Other extensions: ACTUS [Per79, PCM83], PARALLAXIS [Brä89, BBES91], *Lisp [Thi91].

Publication date: 1994